Fuzzy ILP Classification of web reports after linguistic text mining

Authors

  • Jan Dedek
  • Peter Vojtás
  • Marta Vomlelová
Abstract

In this paper we study the problem of classifying textual web reports. We focus specifically on situations in which structured information extracted from the reports is used for classification. We present an experimental classification system based on third-party linguistic analyzers, our previous work on web information extraction, and fuzzy inductive logic programming (fuzzy ILP). A detailed study of the so-called ‘Fuzzy ILP Classifier’ is the main contribution of the paper. The study includes formal models, a prototype implementation, extensive evaluation experiments, and a comparison of the classifier with alternatives such as decision trees, support vector machines, and neural networks.

Similar resources

Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents

Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...

WebGD: A CORBA-based Document Classification and Retrieval System on the Web

This paper presents the design and implementation of the WebGD, a CORBA-based document classification and retrieval system on Internet. The WebGD makes use of such techniques as Web, CORBA, Java, NLP, fuzzy technique, knowledge-based processing and database technology. Unified classification and retrieval model, classifying and retrieving with one reasoning engine and flexible working mode conf...

Discovery of Fuzzy Multiple-Level Web Browsing Patterns

Web usage mining is the application of data mining techniques to discover usage patterns from web data. It can be used to better understand web usage and better serve the needs of rapidly growing web-based applications. Discovery of browsing patterns, page clusters, user clusters, association rules and usage statistics are some usage patterns in the web domain. Web mining of browsing patterns i...

Fuzzy Modeling for Multi-Label Text Classification Supported by Classification Algorithms

Corresponding Author: Beatriz Wilges, Department of Information Systems, Federal University of Santa Catarina, Florianópolis, Brazil. Abstract: The ever-increasing amount of information on the Web is organized in structured, semi-structured and unstructured data. Text classification systems, capable of handling such different structures, may facilitate the work of im...

Journal title:
  • Inf. Process. Manage.

Volume 48  Issue

Pages  -

Publication date 2012